LifePrint: a novel k-tuple distance method for construction of phylogenetic trees
Authors
Abstract
PURPOSE Here we describe LifePrint, a sequence alignment-independent k-tuple distance method to estimate relatedness between complete genomes.
METHODS We designed a representative sample of all possible DNA tuples of length 9 (9-tuples). The final sample comprises 1878 tuples (called the LifePrint set of 9-tuples; LPS9) that differ from each other by at least two internal and noncontiguous nucleotides. To validate our k-tuple distance method, we analyzed several real and simulated viroid genomes. Using different distance metrics, we estimated the k-tuple distances between these genomic sequences, then used those distances to construct phylogenetic trees with the neighbor-joining algorithm. We compared the accuracy of LPS9 with that of the previously reported 5-tuple method using the symmetric differences between the trees estimated by each method and a simulated "true" phylogenetic tree.
RESULTS The optimal search scheme identified for LPS9 allows at most two nucleotide differences between each 9-tuple and the scrutinized genome. Similarity search results on simulated viroid genomes indicate that, in most cases, LPS9 efficiently detects single-base substitutions between genomes. Analysis of simulated genomic variants with a high proportion of base substitutions indicates that LPS9 can discern relationships between genomic variants with up to 40% nucleotide substitution.
CONCLUSION Our LPS9 method generates more accurate phylogenetic reconstructions than the previously proposed 5-tuple strategy. LPS9-reconstructed trees show higher bootstrap proportion values than distance trees derived from the 5-tuple method.
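The probe-matching and distance steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two toy probes stand in for the 1878-tuple LPS9 set (which is not reproduced here), the function names `hamming`, `probe_profile`, and `distance_matrix` are hypothetical, and Euclidean distance is an assumed choice, since the abstract only says that "different distance metrics" were used. The only detail taken from the abstract is the search scheme of at most two nucleotide differences between a 9-tuple and a genome window.

```python
from itertools import combinations
from math import sqrt


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def probe_profile(genome: str, probes: list[str], max_mismatches: int = 2) -> list[int]:
    """For each probe, count genome windows within max_mismatches of it.

    max_mismatches=2 mirrors the search scheme reported in the abstract;
    the probes passed in are illustrative, not the real LPS9 set.
    """
    k = len(probes[0])
    windows = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    return [sum(hamming(p, w) <= max_mismatches for w in windows) for p in probes]


def euclidean(u: list[int], v: list[int]) -> float:
    """Euclidean distance between two profiles (an assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(genomes: dict[str, str], probes: list[str]) -> dict[tuple[str, str], float]:
    """Pairwise profile distances, keyed by (name_a, name_b)."""
    profiles = {name: probe_profile(seq, probes) for name, seq in genomes.items()}
    return {(a, b): euclidean(profiles[a], profiles[b])
            for a, b in combinations(profiles, 2)}


# Toy data: two 9-tuples and three short made-up sequences.
toy_probes = ["ACGTACGTA", "TTTTTTTTT"]
toy_genomes = {
    "g1": "ACGTACGTAC",
    "g2": "ACGTACGTAC",
    "g3": "TTTTTTTTTT",
}
toy_distances = distance_matrix(toy_genomes, toy_probes)
```

A matrix of this kind is the input that a neighbor-joining implementation would consume to build the tree; that step is omitted here to keep the sketch self-contained.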
Similar resources
A Novel Genetic Algorithm based Approach for Optimization of Distance Matrix for Phylogenetic Tree Construction
Phylogenies are useful for organizing knowledge of biological diversity, for structuring classifications, and for providing knowledge of events that occurred during evolution. Different phylogenetic reconstruction techniques are available. In this paper, a distance-based technique is used. The distance measure is an important issue in phylogenetic analysis. Traditional approaches are time-consuming du...
Performance comparison between k-tuple distance and four model-based distances in phylogenetic tree reconstruction
Phylogenetic tree reconstruction requires construction of a multiple sequence alignment (MSA) from sequences. Computationally, it is difficult to achieve an optimal MSA for many sequences. Moreover, even if an optimal MSA is obtained, it may not be the true MSA that reflects the evolutionary history of the underlying sequences. Therefore, errors can be introduced during MSA construction which i...
A novel approach to phylogenetic trees: d-Dimensional geometric Steiner trees
We suggest a novel distance-based method for the determination of phylogenetic trees. It is based on multidimensional scaling and Euclidean Steiner trees in high-dimensional spaces. Preliminary computational experience shows that the use of Euclidean Steiner trees for finding phylogenetic trees is a viable approach. Experiments also indicate that the new method is comparable with results produc...
$k$-tuple total restrained domination/domatic in graphs
For any integer $k\geq 1$, a set $S$ of vertices in a graph $G=(V,E)$ is a $k$-tuple total dominating set of $G$ if any vertex of $G$ is adjacent to at least $k$ vertices in $S$, and any vertex of $V-S$ is adjacent to at least $k$ vertices in $V-S$. The minimum number of vertices of such a set in $G$ we call the $k$-tuple total restrained domination number of $G$. The maximum num...